Integration of Collocation Statistics into the Probabilistic Retrieval Model

Authors

  • Olga Vechtomova
  • Stephen Robertson
Abstract

The paper presents a method of combining corpus information on word collocations with the probabilistic model of information retrieval. Corpus term dependencies are used to modify the probabilistic retrieval based on the term independence assumption. Collocates are derived from windows around term occurrences in the corpus. Statistical measures of mutual information and Z score are applied to select significantly associated collocates which are later used in query expansion. The results of the lexico-semantic analysis of significant collocates and their comparison with engineered term networks and thesauri are also discussed.
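The collocate-selection step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the window size, the thresholds, and the exact forms of the mutual information and Z score (the common textbook variants, with the expected co-occurrence count scaled by the window span) are all assumptions made for this example.

```python
import math
from collections import Counter


def collocation_scores(tokens, node, window=2):
    """Score terms co-occurring with `node` within +/- `window` tokens.

    Returns {term: (mutual_information, z_score)}. The formulas are the
    common textbook variants and are an assumption of this sketch.
    """
    n = len(tokens)
    freq = Counter(tokens)   # corpus frequency of every term
    joint = Counter()        # co-occurrence counts inside the windows

    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo, hi = max(0, i - window), min(n, i + window + 1)
        for j in range(lo, hi):
            if j != i:
                joint[tokens[j]] += 1

    span = 2 * window        # window positions around each node occurrence
    scores = {}
    for term, observed in joint.items():
        # Expected co-occurrences if node and term were independent.
        expected = freq[node] * freq[term] * span / n
        mi = math.log2(observed / expected)
        z = (observed - expected) / math.sqrt(expected)
        scores[term] = (mi, z)
    return scores


def select_collocates(scores, min_mi=0.0, min_z=1.65):
    """Keep terms above both thresholds (threshold values are hypothetical)."""
    return sorted(t for t, (mi, z) in scores.items()
                  if mi >= min_mi and z >= min_z)
```

Terms returned by `select_collocates` would be the candidates for query expansion. How they are weighted inside the probabilistic model is not specified in the abstract, so it is left out of this sketch.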




Publication date: 2000